	The  algorithm   consists  of   five  steps:  digital   image
thresholding,   binary image  contouring,  polygon  nesting,  polygon
smoothing,  and polygon comparing.
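
	By way of illustration, the order of the five steps can be
sketched as follows. The sketch is in Python with names of my own
choosing; CRE itself is machine code, and the five stage routines
are only stubbed here:

        def threshold(image, cut):
            # 1. Digital image thresholding: grey levels to binary.
            return [[1 if v >= cut else 0 for v in row] for row in image]

        def contour(binary): ...     # 2. trace the binary image contours
        def nest(rings): ...         # 3. build the tree of nested polygons
        def smooth(tree): ...        # 4. fit smoother edges to each polygon
        def compare(prev, cur): ...  # 5. link polygons of successive images

        def cre_step(tv_image, prev_polygons, cut):
            # One pass of the five-step algorithm over one television
            # image; prev_polygons are the smoothed polygons of the
            # preceding image, or None for the first image.
            binary = threshold(tv_image, cut)
            polygons = smooth(nest(contour(binary)))
            if prev_polygons is not None:
                compare(prev_polygons, polygons)
            return polygons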

	The acronym CRE stands for "Contour,  Region,  Edge" and
for  "Cart's  Eye". CRE  is  a  solution to  the  problem  of finding
contour  edges in  a  set  of  television  pictures  and  of  linking
corresponding edges  from one picture  to the  next.  The  process is
automatic   and  is  intended  to  run  without  human  intervention.
Furthermore,   the process  is bottom  up; there  are no  significant
inputs other than the given  television images.  The output of CRE is
a 2D  contour map  data structure  which is  suitable input  to a  3D
geometric modeling program.

	The overall design  goal for CRE  was to build a  region edge
finding  program that could  be applied  to a sequence  of television
pictures and that  would output a sequence  of line drawings  without
having to know anything about the content of the images. Furthermore,
it was desired that the line drawings be structured. The six design
choices that determined the character of CRE are:

	1. Dumb vision rather than model driven vision.
	2. Multi image analysis rather than single image analysis.
	3. Total image structure imposed on edge finding, rather
	   than a separate edge finder and image analyzer.
	4. Automatic rather than interactive.
	5. Fixed image window size rather than variable window size.
	6. Machine language rather than higher level language.

	The design choices are ordered from the more strategic to
the more tactical: the first three are research strategies, the
latter three are programming tactics. Adopting these design choices
led to image contouring and contour map structures similar to those
of Krakauer[3] and Zahn[4].

	The first design  choice does not  refer to the issue  of how
model dependent a  finished general vision system will be (it will be
quite model dependent),   but rather to the  issue of how one  should
begin  building such  a system.   I  believe that  the best  starting
points are  at the two apparent extremes of nearly total knowledge of
a particular  visual  world or  nearly total  ignorance.   The  first
extreme involves  synthesis (by computer graphics) of  a predicted 2D
image, followed by comparing the predicted and a perceived image  for
slight differences  which are  expected but  not yet  measured.   The
second extreme involves analysing perceived images into structures
which can  be readily  compared for  near equality  and measured  for
slight differences;  followed by the  construction of a  3D geometric
model of  the perceived world. The point is that in both cases images
are compared,  and in both cases the 3D  model initially (or finally)
contains specific  numerical data on the geometry  and physics of the
particular world being looked at.

	The second design choice, of multi image analysis rather
than single  image analysis, provides a basis  for solving for camera
positions and feature  depths.   The third design  choice solves  (or
rather avoids)  the problem of  integrating an edge  finder's results
into an image. By using a very simple edge finder, and by accepting
all the edges found, the image structure is never lost. This design
postpones the  problem of interpreting photometric  edges as physical
edges.
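
	Concretely, an edge finder of this kind amounts to marking
every place where two adjacent picture elements straddle the
intensity threshold. The following sketch (again in Python, with
illustrative names) keeps every such photometric edge element:

        def edge_elements(image, cut):
            # Put an edge element between every pair of 4-adjacent
            # pixels on opposite sides of the threshold; nothing is
            # discarded, so the image structure is never lost.
            rows, cols = len(image), len(image[0])
            edges = []
            for y in range(rows):
                for x in range(cols):
                    here = image[y][x] >= cut
                    if x + 1 < cols and here != (image[y][x + 1] >= cut):
                        edges.append(((x, y), (x + 1, y)))  # between columns
                    if y + 1 < rows and here != (image[y + 1][x] >= cut):
                        edges.append(((x, y), (x, y + 1)))  # between rows
            return edges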

	The fourth  choice  is a  resolution not  to  write an  image
processor  that requires  operator  assistance and  parameter tuning.
The fifth choice, of the 216 by 288 fixed window size, is a sin that
proved surprisingly expedient; it is explained later. A variable
window  version of CRE at halves,   thirds and other simple fractions
of its present window size will be made at some future date.

	The  final design choice  of using  machine language  was for
the sake of implementing node link data structures that are
processed 100 times faster than LEAP, 10 times faster than compiled LISP
and  that require significantly  less memory  than similar structures
in either LISP or LEAP. Furthermore, machine code assembles and loads
faster than higher level languages, and machine code can be
extensively fixed and altered without recompiling.
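
	As an illustration of the kind of node link structure meant
here (the names are mine, not those of the machine code), each
polygon can be carried as a doubly linked ring of vertex nodes, so
that splicing a vertex in or out costs a few link assignments and
no storage reclamation:

        class Vertex:
            # One node of a polygon ring; cw and ccw link to the
            # clockwise and counterclockwise neighbor vertices.
            __slots__ = ('x', 'y', 'cw', 'ccw')
            def __init__(self, x, y):
                self.x, self.y = x, y
                self.cw = self.ccw = self  # a lone vertex rings to itself

        def insert_after(node, x, y):
            # Splice a new vertex into the ring just clockwise of node.
            new = Vertex(x, y)
            new.cw, new.ccw = node.cw, node
            node.cw.ccw = new
            node.cw = new
            return new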

	It  is  my  impression  that  CRE  does  not  raise  any  new
scientific problems;  nor does it  have any  really new solutions  to
the old  problems; rather CRE is another  competent video region edge
finding program with its  own set of tricks.  However, it is  further
my impression that the particular tricks for  smoothing,  nesting and
comparing polygons in CRE are original as programming techniques.